ABSTRACT

This paper develops a novel Fault Recovery (FR) architecture for Motion Analysis Computing
Arrays (MACA). Any single fault in a Processing Element (PE) of an MACA can be effectively
detected and corrected using the concept of dual-remnant codes, i.e., the Remnant and Proportionate (RP)
code. A good example application is the H.264 video compression standard, also known as MPEG-4
Advanced Video Coding. It uses a context-based adaptive method to speed up multiple-reference-frame
Motion Analysis (MA) by avoiding unnecessary reference-frame computation. A large PE array
accelerates the computation, especially in high-resolution devices such as HDTV (High Definition
Television). The visual quality and Peak Signal-to-Noise Ratio (PSNR) at a given bit rate are affected
if a fault occurs in the MA process.

I. INTRODUCTION

Advances in semiconductor, digital signal processing, and communication technologies have made
multimedia applications more flexible and reliable. A good example is H.264, or MPEG-4 Part 10
Advanced Video Coding, the next-generation video compression standard [1],[2], which is necessary for a
wide range of applications to reduce the total amount of data required for transmitting or storing video.
Among the coding tools, MA is of high importance because it exploits the temporal redundancy between
successive frames, which reduces the coding time. Moreover, among the enormous computations performed
in the entire coding system, MA is widely regarded as the most computationally intensive part of a
video coding system [3]. An MA generally consists of PEs arranged in an array of size N x N. Accelerating
the computation therefore depends on a large PE array, particularly in high-resolution devices with a
large search range such as HDTV (High Definition Television) [4]; N = 4 is considered in this work.
Also, the video quality and peak signal-to-noise ratio (PSNR) at a given bit rate are affected if a fault
occurs in the MA process. A testable design is thus increasingly important to ensure the reliability of
the numerous PEs in an MA. Moreover, although advances in VLSI technology facilitate the integration of
a large number of PEs of an MA into a chip, the logic-per-pin ratio increases as a result, thereby
slightly decreasing the logic testing efficiency on the chip. For commercial purposes, the MA must
therefore adopt Design For Testability (DFT) [5]-[7]. DFT focuses on making devices easier to test,
which makes the system more reliable. DFT techniques rely on reconfiguring the circuit during testing
to improve its testability [8]-[10]. Owing to recent trends in micron technologies and the increasing
complexity of electronic systems and circuits, Built-in Self-Test (BIST) schemes have become
necessary [11]-[14]. The rest of this paper is organized as follows. Section 2 describes the overall
Fault Recovery system. Sections 3 and 5 describe the modules and a numerical example of the Fault
Recovery process. Section 4 generalizes the methodology for designing the Fault Recovery process.
Sections 6 and 7 present the simulation setup and its results. Section 8 discusses the results to
demonstrate the feasibility of the proposed Fault Recovery architecture for MA testing applications.
Conclusions are finally drawn in Section 9.

II. FAULT RECOVERY DESIGN 

Fig. 1 shows the conceptual view of the proposed Fault Recovery scheme, which comprises two major
circuit designs, i.e., a fault detection circuit (FDC) and a data recovery circuit (DRC), to identify
faults and recover the corresponding data in a specific circuit under test (CUT). The test code
generator (TCG) in Fig. 1 utilizes the concept of the RP code to generate the corresponding test codes
for fault identification and data recovery. In other words, the test codes from the TCG and the primary
output from the CUT are delivered to the FDC to determine whether the circuit under test has faults.
The DRC is in charge of recovering data from the TCG. Additionally, a selector is enabled to export
either the fault-free data or the data recovery results [15],[16]. Importantly, the proposed Fault
Recovery scheme is feasible for any array-based computing structure, such as MA, the discrete cosine
transform (DCT), an iterative logic array (ILA), or a finite impulse response (FIR) filter, to identify
faults and recover the corresponding data. In the proposed circuit, the output is available in the
second clock cycle rather than the 22nd clock cycle, because the RP block structure is changed. The
proposed Fault Recovery design for MA testing can also identify faults and recover data within an
acceptable area and time limit, and it performs satisfactorily in terms of throughput and reliability
for MA testing applications.

Figure 1. Design Layout of Fault Recovery Scheme



III. MODULE DESCRIPTION 

3.1. PROCESSING ELEMENT 

An MA (Motion Analysis) consists of many PEs incorporated in a 1-D or 2-D array for video encoding
applications. A PE generally consists of two ADDs (i.e., an 8-b ADD and a 12-b ADD) and an accumulator
(ACC). The 8-b ADD (a pixel has 8-b data) is used to compute the absolute difference between the
current pixel (Cur_pixel) and the reference pixel (Ref_pixel). Additionally, the 12-b ADD and the ACC
are required to accumulate the results from the 8-b ADD in order to determine the sum of absolute
difference (SAD) value for video encoding applications.
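
As a minimal behavioural sketch of the PE datapath described above, the following Python model computes an 8-bit absolute difference per pixel pair and accumulates it in a 12-bit register. The function name `pe_accumulate`, the mask constants, and the sample pixel values are illustrative assumptions; they are not taken from the paper's Verilog design.

```python
# Behavioural model of one PE (illustrative; all names are hypothetical).
MASK_8 = 0xFF    # width of the 8-b ADD (one pixel)
MASK_12 = 0xFFF  # width of the 12-b ADD and the accumulator (ACC)


def pe_accumulate(cur_pixels, ref_pixels):
    """Return the SAD of two equal-length pixel sequences."""
    acc = 0
    for cur, ref in zip(cur_pixels, ref_pixels):
        diff = abs((cur & MASK_8) - (ref & MASK_8))  # 8-b absolute difference
        acc = (acc + diff) & MASK_12                 # 12-b accumulation
    return acc


# Hypothetical pixel values: |128-120| + |64-70| + |32-32| + |16-0|
sad = pe_accumulate([128, 64, 32, 16], [120, 70, 32, 0])
```

The 12-bit accumulator width is sufficient for a 4x4 block: sixteen differences of at most 255 each sum to at most 4080, which is below 4096.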

3.2. SUM OF ABSOLUTE DIFFERENCE TREE 

We propose a 2-D intra-level architecture called Propagate Partial SAD. This architecture is
composed of PE arrays with a 1-D adder tree in the vertical direction. Current pixels are stored in
each PE, and two sets of continuous reference pixels in a row are broadcast to the PE arrays at the
same time. In each PE array with a 1-D adder tree, the absolute differences are computed and summed by
the adder tree to generate a single row SAD. The row SADs are accumulated and propagated through
propagation registers in the vertical direction. The reference data of searching candidates in the even
and odd columns are input through Ref. Pixel and Ref. Pixel 1. The SAD of the initial search candidate
in the zeroth column is generated first, and the SADs of the other searching candidates are generated
sequentially in the following cycles. When the last searching candidates in each column are being
computed, the reference data of searching candidates in the next column begin to be input through the
other reference input. In Propagate Partial SAD, broadcasting reference pixel rows and propagating
partial row SADs in the vertical direction requires fewer reference registers and yields a short
critical path.
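
The row-wise accumulation described above can be sketched as follows. This is a simplified software model under stated assumptions: it processes a single search candidate, ignores the pipelining of several candidates, and uses a hypothetical function name (`propagate_partial_sad`) and made-up pixel values.

```python
def propagate_partial_sad(cur_block, ref_block):
    """Return the partial SAD after each row; the last entry is the block SAD."""
    partial = 0
    partials = []
    for cur_row, ref_row in zip(cur_block, ref_block):
        # 1-D adder tree: sum the absolute differences of one row.
        row_sad = sum(abs(c - r) for c, r in zip(cur_row, ref_row))
        # Propagation register: accumulate in the vertical direction.
        partial += row_sad
        partials.append(partial)
    return partials


# Hypothetical 4x4 current and reference blocks.
cur = [[128, 10, 20, 30], [40, 50, 60, 70], [80, 90, 100, 110], [120, 130, 140, 150]]
ref = [[120, 12, 20, 25], [40, 55, 60, 70], [70, 90, 105, 110], [120, 130, 150, 149]]
partials = propagate_partial_sad(cur, ref)
```

Only one running partial sum is carried from row to row, which mirrors the reduced register count claimed for the propagating structure.
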

3.2.1. SUM OF ABSOLUTE DIFFERENCE VALUE CALCULATION

By utilizing PEs, the SAD of a macroblock of size N x N can be evaluated as follows:

\[
\begin{aligned}
\mathrm{SAD} &= \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} \left| X_{ij} - Y_{ij} \right| \\
&= \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} \left| \left( q_{x_{ij}} \cdot m + r_{x_{ij}} \right) - \left( q_{y_{ij}} \cdot m + r_{y_{ij}} \right) \right|
\end{aligned}
\]

where $r_{x_{ij}}, q_{x_{ij}}$ and $r_{y_{ij}}, q_{y_{ij}}$ denote the corresponding RP codes of $X_{ij}$ and
$Y_{ij}$ modulo $m$. Importantly, $X_{ij}$ and $Y_{ij}$ represent the luminance pixel values of Cur_pixel and
Ref_pixel, respectively.

3.3. REMNANT PROPORTIONATE CODE GENERATION ALGORITHM 

In the RPCG algorithm, the remnant code is a separable arithmetic code, obtained by estimating a
remnant for the data and appending it to the data [17], [18]. Fault detection logic for operations is
typically derived using a separate remnant code, so that the detection logic is simple and easily
implemented. However, only a single-bit fault can be detected based on the remnant code. Additionally,
a fault cannot be recovered effectively by using the remnant code alone. Therefore, this work presents
a proportionate code, which is derived from the remnant code, to assist the remnant code in rectifying
multiple faults [19]. The corresponding circuit design of the RPCG is easily realized by using simple
adders (ADDs). Namely, the RP code can be generated with low complexity and little hardware cost [20].
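
To illustrate why the proportionate code complements the remnant code, the sketch below encodes two values that differ by exactly the modulus. A remnant code alone cannot tell them apart, because any fault whose magnitude is a multiple of the modulus leaves the remnant unchanged; the (remnant, proportionate) pair can. The modulus 63 (that is, 2**6 - 1), the sample values, and the name `rp_encode` are assumptions chosen for illustration.

```python
def rp_encode(x, m):
    """Return (remnant, proportionate) = (x mod m, x // m)."""
    proportionate, remnant = divmod(x, m)
    return remnant, proportionate


M = 63            # assumed modulus (2**6 - 1), for illustration only
good = 200        # hypothetical fault-free value
bad = good + M    # hypothetical faulty value, off by exactly M

r_good, p_good = rp_encode(good, M)
r_bad, p_bad = rp_encode(bad, M)

# The remnants coincide, so the remnant code alone misses this fault.
remnant_detects = r_good != r_bad
# The proportionates differ, so the RP pair exposes it.
rp_detects = (r_good, p_good) != (r_bad, p_bad)
```

Because a value is uniquely determined by its remnant and proportionate, two different values always yield different RP pairs.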

3.4. TEST CODE GENERATION 

TCG is an important component of the proposed Fault Recovery Design. Notably, TCG design is based 
on the ability of the RPCG Circuit to generate corresponding test codes in order to identify faults and recover 
data. 

3.5. FAULT DETECTION CIRCUIT 

This module performs fault detection for a specific PE_i by using the FDC, which compares the test
codes from the TCG with the RP code of the PE_i output in order to determine whether faults have
occurred. The FDC output is then used to generate a 0/1 signal indicating that the tested PE_i is
fault-free/faulty. An XOR operation identifies a fault whenever the remnant or proportionate values
differ. Because a fault only affects the logic in the fan-out cone of the fault region, concurrent
fault simulation exploits this fact and simulates only the differential parts of the whole circuit.
Concurrent fault simulation is essentially an event-driven simulation in which the fault-free circuit
and the faulty circuits are simulated together.
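
The XOR-based comparison can be sketched in a few lines. This is a software analogue under stated assumptions, not the paper's circuit: the function name `fdc`, the modulus 63, and the values 128 (correct) and 120 (faulty) are chosen for illustration.

```python
def fdc(test_code, pe_code):
    """Return 0 if the PE is fault-free, 1 if it is faulty.

    Both arguments are (remnant, proportionate) pairs.
    """
    r_diff = test_code[0] ^ pe_code[0]  # bitwise XOR of the remnants
    p_diff = test_code[1] ^ pe_code[1]  # bitwise XOR of the proportionates
    return 1 if (r_diff | p_diff) != 0 else 0


M = 63                          # assumed modulus, for illustration only
expected = (128 % M, 128 // M)  # test code from the TCG for a value of 128
observed = (120 % M, 120 // M)  # RP code of a faulty PE output of 120

fault_flag = fdc(expected, observed)
```

Any non-zero XOR bit in either field raises the flag, so the comparison needs no arithmetic beyond the bitwise gates.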

3.6. DATA RECOVERY CIRCUIT 

This module generates the fault-free output by multiplying the proportionate code by a constant
value (the modulus m) and adding the remnant code. During data recovery, the DRC plays a significant
role in recovering the RP code from the TCG.
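
A minimal sketch of the recovery step and the output selector follows. The names `drc` and `select_output`, the modulus 63, and the sample values are assumptions for illustration; the source only states that the proportionate code is multiplied by a constant and combined with the code from the TCG.

```python
def drc(remnant, proportionate, m):
    """Rebuild the data word from its RP code: X = proportionate * m + remnant."""
    return proportionate * m + remnant


def select_output(pe_output, test_code, fault_flag, m):
    """Selector: pass the PE output when fault-free, else the recovered value."""
    if fault_flag:
        return drc(test_code[0], test_code[1], m)
    return pe_output


M = 63                           # assumed modulus, for illustration only
test_code = (128 % M, 128 // M)  # RP code of the correct value 128 from the TCG
recovered = select_output(120, test_code, fault_flag=1, m=M)
```

Here the faulty PE output 120 is discarded and the value 128 is rebuilt from the test code as 2 * 63 + 2.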

IV. METHODOLOGY 

Coding approaches such as the parity code, Berger code, and remnant code have been considered
only for identifying circuit faults. The remnant code R is a separable arithmetic code, obtained by
estimating a remnant for the data and appending it to the data, i.e., R = |X|_m. Binary data X is coded
as a pair (X, R) with modulus m = 2^w - 1, where w is the word length. The proportionate code
P = ⌊X/m⌋ is derived from the remnant code to identify and recover multiple faults. To simplify the
circuit design, the implementation is carried out using simple adders (ADDs).
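
One common way to obtain both codes with additions only relies on the fact that 2^w is congruent to 1 modulo m = 2^w - 1, so X = a * 2^w + b can be rewritten as a * m + (a + b). The sketch below applies this folding repeatedly. It is an illustration of the adder-only idea under that assumption, not necessarily the circuit used in this work, and the name `rp_encode_adders` is hypothetical.

```python
def rp_encode_adders(x, w):
    """Return (R, P) = (x mod m, x // m) for m = 2**w - 1.

    Uses only shifts, masks, comparisons and additions.
    """
    m = (1 << w) - 1
    p = 0
    while x > m:
        high, low = x >> w, x & m
        p += high        # x = high * 2**w + low = high * m + (high + low)
        x = high + low   # fold the high part onto the low part
    if x == m:           # a full word of ones is one more multiple of m
        p += 1
        x = 0
    return x, p


remnant, proportionate = rp_encode_adders(128, 6)  # m = 63
```

Each loop iteration strictly reduces x, so the fold terminates after a few additions for the word sizes used in a PE.
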

V. NUMERICAL EXAMPLE

Table 1. Accumulation of original data having a pixel value of 128 in the (1,1) position of a 4x4 Current Macro Block

Table 2. Injection of fault data having a pixel value of 120 in the (1,1) position of a 4x4 Current Macro Block

VI. SIMULATION SET UP

6.1. CREATING THE WORKING LIBRARY

ModelSim is a verification and simulation tool for VHDL, Verilog, SystemVerilog, and mixed-language
designs. In ModelSim, all designs are compiled into a library. A new simulation typically starts by
creating a working library called "work", which is the library name used by the compiler as the
default destination for compiled design units.

6.2. COMPILING YOUR DESIGN

After creating the working library, compile the design units into it. The ModelSim library format is
compatible across all supported platforms, so the design can be simulated on any platform without
recompiling it. With the design compiled, load the simulator by invoking it on a top-level module
(Verilog) or on a configuration or entity/architecture pair (VHDL). Assuming the design loads
successfully, the simulation time is set to zero, and a run command is entered to begin the simulation.

6.3. DEBUGGING RESULTS

If the results are not as expected, ModelSim's debugging environment can be used to track down the
cause of the problem.



Issn 2250-3005 



Page 14 



A Novel Fault Recovery Scheme. 



VII. SIMULATION RESULTS

Figure 2. Simulation Result of Processing Element Circuit obtaining Absolute Differences from the Current and Reference pixels.

Figure 3. Simulation Result of SAD Generation Circuit obtaining its value from a suitable Candidate Block for a specific PE

Figure 4. Simulation Result of RPCG Circuit obtaining its value from lowest distortion SAD value



ISSN 2250-3005, October 2013




Figure 6. Simulation Result of Fault Detection Circuit identifying the fault using RP Codes and Test Codes for a Specific PE in a Motion Analysis Process

Figure 7. Simulation Result of Data Recovery Circuit identifying the fault using Test Codes for a Specific PE in a Motion Analysis Process

Figure 8. Simulation Result of Fault Recovery design in identifying the fault and recovering the original data for a Specific PE in a Motion Analysis Process

Figure 9. Simulation Result of Fault Recovery design in identifying the fault using RP Codes and Test Codes for a Specific PE in a Motion Analysis Process

VIII. RESULTS AND DISCUSSION

Table 3. Design Specifications of Fault Recovery Design with 16 PEs and 16 TCGs in a Motion Analysis Process for a specific Processing Element

S.NO   DESIGN SPECIFICATIONS
1.     Algorithm                     Remnant Proportionate Code Generation Algorithm
2.     No. of Processing Elements    16 (4x4 Array)
3.     Supported Block Size          All Available Sizes
4.     Process Technology            TSMC 0.18-µm 1P6M CMOS technology
5.     Maximum Frequency             538.92 MHz



Table 4. Performance Analysis of Fault Recovery Design with 16 PEs and 16 TCGs in a Motion Analysis Process for a specific Processing Element

S.NO   PERFORMANCE ANALYSIS
1.     No. of Test Patterns          16
2.     Operating Speed               10.973 ns (91.13 MHz)
3.     Area (Gate Counts)            11726
4.     Power Consumption             289.68 mW
5.     Operating Conditions          29°C
6.     Fault Coverage                100 %

IX. CONCLUSION 

This work presents a novel Fault Recovery scheme for detecting faults and recovering the data of
PEs in an MA. Based on the RP code, an RPCG-based TCG design is developed to generate the corresponding
test codes to identify faults and recover data. The proposed Fault Recovery scheme is implemented in
the Verilog Hardware Description Language and synthesized with the Synopsys Design Compiler using
TSMC 0.18-µm 1P6M CMOS technology. Experimental results indicate that the proposed Fault Recovery
design can effectively identify faults and recover data in the PEs of an MA, with an operating time of
10.973 ns and an acceptable area cost.